Improving profile HMM discrimination by adapting transition probabilities.
Abstract
Profile hidden Markov models (HMMs) are used to model protein families and to detect evolutionary relationships between proteins. A profile HMM is typically constructed from a multiple alignment of a set of related sequences. Transition probability parameters in an HMM model insertions and deletions in the alignment. We show here that taking unrelated sequences into account when estimating the transition probability parameters helps to construct more discriminative models for the global/local alignment mode. After normal HMM training, a simple heuristic is employed that adjusts the transition probabilities between match and delete states according to observed transitions in the training set relative to the unrelated (noise) set. The method is called adaptive transition probabilities (ATP) and is based on the HMMER package implementation. It was benchmarked in two remote homology tests based on the Pfam and the SCOP classifications. Compared to the HMMER default procedure, the rate of misclassification was reduced significantly in both tests and across all levels of error rate.
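The abstract does not give the ATP update rule, so the following is only an illustrative sketch of the general idea, not the published method. It assumes that transition probabilities out of a match state are first estimated from alignment counts plus prior pseudocounts, and that the match-to-delete probability is then rescaled by the ratio of how often that transition is observed in the training set versus the noise set. The function names, the `strength` exponent, the `floor` constant and the ratio-based rule are all hypothetical.

```python
def estimate_transitions(counts, pseudocounts):
    """Estimate transition probabilities out of one match state.

    counts and pseudocounts map the keys "MM", "MI", "MD"
    (match->match, match->insert, match->delete) to non-negative numbers.
    Each probability is (count + pseudocount) / total.
    """
    total = sum(counts[t] + pseudocounts[t] for t in counts)
    return {t: (counts[t] + pseudocounts[t]) / total for t in counts}


def adapt_match_delete(probs, train_freq, noise_freq, strength=1.0, floor=1e-6):
    """Hypothetical adjustment step (NOT the rule from the paper).

    Scales the M->D probability by (train_freq / noise_freq) ** strength,
    where train_freq and noise_freq are the observed frequencies of the
    M->D transition at this position in the training and noise sets.
    M->I is left unchanged and M->M receives the remaining probability mass.
    """
    ratio = (max(train_freq, floor) / max(noise_freq, floor)) ** strength
    md = probs["MD"] * ratio
    # Cap M->D so that M->M stays strictly positive after renormalizing.
    md = min(md, 1.0 - probs["MI"] - floor)
    mm = 1.0 - probs["MI"] - md
    return {"MM": mm, "MI": probs["MI"], "MD": md}


if __name__ == "__main__":
    # Toy numbers, invented for illustration only.
    trained = estimate_transitions(
        counts={"MM": 18, "MI": 1, "MD": 1},
        pseudocounts={"MM": 1, "MI": 1, "MD": 1},
    )
    adapted = adapt_match_delete(trained, train_freq=0.05, noise_freq=0.20)
    print(trained)
    print(adapted)
```

In this toy setting the deletion is seen less often in the training set than in the noise set, so the sketch lowers the match-to-delete probability. Alignments that rely on that deletion then score worse, which is one plausible way such an adjustment could improve discrimination.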
Similar resources
Transition Priors for Protein Hidden Markov Models: An Empirical Study towards Maximum Discrimination
Insertions and deletions in a profile hidden Markov model (HMM) are modeled by transition probabilities between insert, delete and match states. These are estimated by combining observed data and prior probabilities. The transition prior probabilities can be defined either ad hoc or by maximum likelihood (ML) estimation. We show that the choice of transition prior greatly affects the HMM's abil...
Hidden Markov Models for Remote Protein Homology Detection
Genome sequencing projects are advancing at a staggering pace and are daily producing large amounts of sequence data. However, the experimental characterization of the encoded genes and proteins is lagging far behind. Interpretation of genomic sequences therefore largely relies on computational algorithms and on transferring annotation from characterized proteins to related uncharacterized prot...
State-Transition Interpolation and MAP Adaptation for HMM-based Dysarthric Speech Recognition
This paper describes the results of our experiments in building speaker-adaptive recognizers for talkers with spastic dysarthria. We study two modifications – (a) MAP adaptation of speaker-independent systems trained on normal speech and, (b) using a transition probability matrix that is a linear interpolation between fully ergodic and (exclusively) left-to-right structures, for both speaker-dep...
Hidden Markov Models in Protein Modeling
The use of Hidden Markov Models (HMM) in protein modeling is described. Sequence alignment based on profile HMMs can help identify protein family members and presents some advantages. This possibility is discussed. Introduction. The functional and structural characterization of new proteins can be done by taking advantage of their evolutionary relation with proteins of known structure or func...
Recent Topics in Speech Recognition Research at NTT Laboratories
This paper introduces three recent topics in speech recognition research at NTT (Nippon Telegraph and Telephone) Human Interface Laboratories. The first topic is a new HMM (hidden Markov model) technique that uses VQ-code bigrams to constrain the output probability distribution of the model according to the VQ-codes of previous frames. The output probability distribution changes depending on th...
Journal title:
- Journal of Molecular Biology
Volume 338, Issue 4
Pages -
Publication date 2004